The Most Basic-est Model of Them All

They all survive


In [21]:
import csv
import numpy as np

The optimistic model: They All Survive


In [22]:
test_file        = open('./data/test.csv', 'rb') # Open the test data
test_file_object = csv.reader(test_file)
header           = test_file_object.next()

In [23]:
header


Out[23]:
['PassengerId',
 'Pclass',
 'Name',
 'Sex',
 'Age',
 'SibSp',
 'Parch',
 'Ticket',
 'Fare',
 'Cabin',
 'Embarked']

Open a new model (.csv) file to write to


In [24]:
predictions_file = open('./models/jfaPythonModel-allSurvive.csv', 'wb')
predictions_file_object = csv.writer(predictions_file)

Write the column header row


In [25]:
predictions_file_object.writerow(['PassengerID', 'Survived'])

In [26]:
for row in test_file_object:
    predictions_file_object.writerow([row[0], "1"])

In [27]:
test_file.close()
predictions_file.close()
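The cells above can be condensed into a single helper. The following is a minimal sketch, assuming Python 3 (the cells in this notebook use Python 2 idioms such as `reader.next()` and the `'rb'`/`'wb'` file modes). The function name `write_constant_predictions` and its parameters are hypothetical and not part of the original notebook. Using `with` closes both files even if an error occurs part way through.

```python
import csv


def write_constant_predictions(test_path, out_path, label):
    """Write a predictions file that assigns the same label to every passenger.

    Hypothetical helper: the name and signature are illustrative only.
    Returns the number of passenger rows written.
    """
    count = 0
    # newline='' is what the Python 3 csv module expects for file objects.
    with open(test_path, newline='') as test_file, \
            open(out_path, 'w', newline='') as out_file:
        reader = csv.reader(test_file)
        writer = csv.writer(out_file)
        next(reader)  # skip the header row of the test data
        # Same header row as the notebook cell above.
        writer.writerow(['PassengerID', 'Survived'])
        for row in reader:
            writer.writerow([row[0], label])  # row[0] is the passenger id
            count += 1
    return count
```

For example, `write_constant_predictions('./data/test.csv', './models/jfaPythonModel-allSurvive.csv', '1')` should produce the same file as the cells above, provided those paths exist.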

Take a look at the resulting predictions


In [17]:
output_predictions_file        = open('./models/jfaPythonModel-allSurvive.csv', 'rb')
output_predictions_file_object = csv.reader(output_predictions_file)
data                           = []

In [18]:
for row in output_predictions_file_object:
    data.append(row[0:])
data = np.array(data)
output_predictions_file.close() # Close the file once all rows are read

In [19]:
data.shape


Out[19]:
(418, 2)

In [20]:
data


Out[20]:
array([['PassengerID', 'Survived'],
       ['893', '1'],
       ['894', '1'],
       ['895', '1'],
       ['896', '1'],
       ['897', '1'],
       ['898', '1'],
       ['899', '1'],
       ['900', '1'],
       ['901', '1'],
       ['902', '1'],
       ['903', '1'],
       ['904', '1'],
       ['905', '1'],
       ['906', '1'],
       ['907', '1'],
       ['908', '1'],
       ['909', '1'],
       ['910', '1'],
       ['911', '1'],
       ['912', '1'],
       ['913', '1'],
       ['914', '1'],
       ['915', '1'],
       ['916', '1'],
       ['917', '1'],
       ['918', '1'],
       ['919', '1'],
       ['920', '1'],
       ['921', '1'],
       ['922', '1'],
       ['923', '1'],
       ['924', '1'],
       ['925', '1'],
       ['926', '1'],
       ['927', '1'],
       ['928', '1'],
       ['929', '1'],
       ['930', '1'],
       ['931', '1'],
       ['932', '1'],
       ['933', '1'],
       ['934', '1'],
       ['935', '1'],
       ['936', '1'],
       ['937', '1'],
       ['938', '1'],
       ['939', '1'],
       ['940', '1'],
       ['941', '1'],
       ['942', '1'],
       ['943', '1'],
       ['944', '1'],
       ['945', '1'],
       ['946', '1'],
       ['947', '1'],
       ['948', '1'],
       ['949', '1'],
       ['950', '1'],
       ['951', '1'],
       ['952', '1'],
       ['953', '1'],
       ['954', '1'],
       ['955', '1'],
       ['956', '1'],
       ['957', '1'],
       ['958', '1'],
       ['959', '1'],
       ['960', '1'],
       ['961', '1'],
       ['962', '1'],
       ['963', '1'],
       ['964', '1'],
       ['965', '1'],
       ['966', '1'],
       ['967', '1'],
       ['968', '1'],
       ['969', '1'],
       ['970', '1'],
       ['971', '1'],
       ['972', '1'],
       ['973', '1'],
       ['974', '1'],
       ['975', '1'],
       ['976', '1'],
       ['977', '1'],
       ['978', '1'],
       ['979', '1'],
       ['980', '1'],
       ['981', '1'],
       ['982', '1'],
       ['983', '1'],
       ['984', '1'],
       ['985', '1'],
       ['986', '1'],
       ['987', '1'],
       ['988', '1'],
       ['989', '1'],
       ['990', '1'],
       ['991', '1'],
       ['992', '1'],
       ['993', '1'],
       ['994', '1'],
       ['995', '1'],
       ['996', '1'],
       ['997', '1'],
       ['998', '1'],
       ['999', '1'],
       ['1000', '1'],
       ['1001', '1'],
       ['1002', '1'],
       ['1003', '1'],
       ['1004', '1'],
       ['1005', '1'],
       ['1006', '1'],
       ['1007', '1'],
       ['1008', '1'],
       ['1009', '1'],
       ['1010', '1'],
       ['1011', '1'],
       ['1012', '1'],
       ['1013', '1'],
       ['1014', '1'],
       ['1015', '1'],
       ['1016', '1'],
       ['1017', '1'],
       ['1018', '1'],
       ['1019', '1'],
       ['1020', '1'],
       ['1021', '1'],
       ['1022', '1'],
       ['1023', '1'],
       ['1024', '1'],
       ['1025', '1'],
       ['1026', '1'],
       ['1027', '1'],
       ['1028', '1'],
       ['1029', '1'],
       ['1030', '1'],
       ['1031', '1'],
       ['1032', '1'],
       ['1033', '1'],
       ['1034', '1'],
       ['1035', '1'],
       ['1036', '1'],
       ['1037', '1'],
       ['1038', '1'],
       ['1039', '1'],
       ['1040', '1'],
       ['1041', '1'],
       ['1042', '1'],
       ['1043', '1'],
       ['1044', '1'],
       ['1045', '1'],
       ['1046', '1'],
       ['1047', '1'],
       ['1048', '1'],
       ['1049', '1'],
       ['1050', '1'],
       ['1051', '1'],
       ['1052', '1'],
       ['1053', '1'],
       ['1054', '1'],
       ['1055', '1'],
       ['1056', '1'],
       ['1057', '1'],
       ['1058', '1'],
       ['1059', '1'],
       ['1060', '1'],
       ['1061', '1'],
       ['1062', '1'],
       ['1063', '1'],
       ['1064', '1'],
       ['1065', '1'],
       ['1066', '1'],
       ['1067', '1'],
       ['1068', '1'],
       ['1069', '1'],
       ['1070', '1'],
       ['1071', '1'],
       ['1072', '1'],
       ['1073', '1'],
       ['1074', '1'],
       ['1075', '1'],
       ['1076', '1'],
       ['1077', '1'],
       ['1078', '1'],
       ['1079', '1'],
       ['1080', '1'],
       ['1081', '1'],
       ['1082', '1'],
       ['1083', '1'],
       ['1084', '1'],
       ['1085', '1'],
       ['1086', '1'],
       ['1087', '1'],
       ['1088', '1'],
       ['1089', '1'],
       ['1090', '1'],
       ['1091', '1'],
       ['1092', '1'],
       ['1093', '1'],
       ['1094', '1'],
       ['1095', '1'],
       ['1096', '1'],
       ['1097', '1'],
       ['1098', '1'],
       ['1099', '1'],
       ['1100', '1'],
       ['1101', '1'],
       ['1102', '1'],
       ['1103', '1'],
       ['1104', '1'],
       ['1105', '1'],
       ['1106', '1'],
       ['1107', '1'],
       ['1108', '1'],
       ['1109', '1'],
       ['1110', '1'],
       ['1111', '1'],
       ['1112', '1'],
       ['1113', '1'],
       ['1114', '1'],
       ['1115', '1'],
       ['1116', '1'],
       ['1117', '1'],
       ['1118', '1'],
       ['1119', '1'],
       ['1120', '1'],
       ['1121', '1'],
       ['1122', '1'],
       ['1123', '1'],
       ['1124', '1'],
       ['1125', '1'],
       ['1126', '1'],
       ['1127', '1'],
       ['1128', '1'],
       ['1129', '1'],
       ['1130', '1'],
       ['1131', '1'],
       ['1132', '1'],
       ['1133', '1'],
       ['1134', '1'],
       ['1135', '1'],
       ['1136', '1'],
       ['1137', '1'],
       ['1138', '1'],
       ['1139', '1'],
       ['1140', '1'],
       ['1141', '1'],
       ['1142', '1'],
       ['1143', '1'],
       ['1144', '1'],
       ['1145', '1'],
       ['1146', '1'],
       ['1147', '1'],
       ['1148', '1'],
       ['1149', '1'],
       ['1150', '1'],
       ['1151', '1'],
       ['1152', '1'],
       ['1153', '1'],
       ['1154', '1'],
       ['1155', '1'],
       ['1156', '1'],
       ['1157', '1'],
       ['1158', '1'],
       ['1159', '1'],
       ['1160', '1'],
       ['1161', '1'],
       ['1162', '1'],
       ['1163', '1'],
       ['1164', '1'],
       ['1165', '1'],
       ['1166', '1'],
       ['1167', '1'],
       ['1168', '1'],
       ['1169', '1'],
       ['1170', '1'],
       ['1171', '1'],
       ['1172', '1'],
       ['1173', '1'],
       ['1174', '1'],
       ['1175', '1'],
       ['1176', '1'],
       ['1177', '1'],
       ['1178', '1'],
       ['1179', '1'],
       ['1180', '1'],
       ['1181', '1'],
       ['1182', '1'],
       ['1183', '1'],
       ['1184', '1'],
       ['1185', '1'],
       ['1186', '1'],
       ['1187', '1'],
       ['1188', '1'],
       ['1189', '1'],
       ['1190', '1'],
       ['1191', '1'],
       ['1192', '1'],
       ['1193', '1'],
       ['1194', '1'],
       ['1195', '1'],
       ['1196', '1'],
       ['1197', '1'],
       ['1198', '1'],
       ['1199', '1'],
       ['1200', '1'],
       ['1201', '1'],
       ['1202', '1'],
       ['1203', '1'],
       ['1204', '1'],
       ['1205', '1'],
       ['1206', '1'],
       ['1207', '1'],
       ['1208', '1'],
       ['1209', '1'],
       ['1210', '1'],
       ['1211', '1'],
       ['1212', '1'],
       ['1213', '1'],
       ['1214', '1'],
       ['1215', '1'],
       ['1216', '1'],
       ['1217', '1'],
       ['1218', '1'],
       ['1219', '1'],
       ['1220', '1'],
       ['1221', '1'],
       ['1222', '1'],
       ['1223', '1'],
       ['1224', '1'],
       ['1225', '1'],
       ['1226', '1'],
       ['1227', '1'],
       ['1228', '1'],
       ['1229', '1'],
       ['1230', '1'],
       ['1231', '1'],
       ['1232', '1'],
       ['1233', '1'],
       ['1234', '1'],
       ['1235', '1'],
       ['1236', '1'],
       ['1237', '1'],
       ['1238', '1'],
       ['1239', '1'],
       ['1240', '1'],
       ['1241', '1'],
       ['1242', '1'],
       ['1243', '1'],
       ['1244', '1'],
       ['1245', '1'],
       ['1246', '1'],
       ['1247', '1'],
       ['1248', '1'],
       ['1249', '1'],
       ['1250', '1'],
       ['1251', '1'],
       ['1252', '1'],
       ['1253', '1'],
       ['1254', '1'],
       ['1255', '1'],
       ['1256', '1'],
       ['1257', '1'],
       ['1258', '1'],
       ['1259', '1'],
       ['1260', '1'],
       ['1261', '1'],
       ['1262', '1'],
       ['1263', '1'],
       ['1264', '1'],
       ['1265', '1'],
       ['1266', '1'],
       ['1267', '1'],
       ['1268', '1'],
       ['1269', '1'],
       ['1270', '1'],
       ['1271', '1'],
       ['1272', '1'],
       ['1273', '1'],
       ['1274', '1'],
       ['1275', '1'],
       ['1276', '1'],
       ['1277', '1'],
       ['1278', '1'],
       ['1279', '1'],
       ['1280', '1'],
       ['1281', '1'],
       ['1282', '1'],
       ['1283', '1'],
       ['1284', '1'],
       ['1285', '1'],
       ['1286', '1'],
       ['1287', '1'],
       ['1288', '1'],
       ['1289', '1'],
       ['1290', '1'],
       ['1291', '1'],
       ['1292', '1'],
       ['1293', '1'],
       ['1294', '1'],
       ['1295', '1'],
       ['1296', '1'],
       ['1297', '1'],
       ['1298', '1'],
       ['1299', '1'],
       ['1300', '1'],
       ['1301', '1'],
       ['1302', '1'],
       ['1303', '1'],
       ['1304', '1'],
       ['1305', '1'],
       ['1306', '1'],
       ['1307', '1'],
       ['1308', '1'],
       ['1309', '1']], 
      dtype='|S11')
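Scrolling through several hundred rows is an error-prone way to check a file. Below is a short sketch of a programmatic check, assuming Python 3 and NumPy. The `rows` list is a hypothetical three-row stand-in for the contents of the predictions file, and `check_predictions` is an illustrative name rather than anything from the notebook. Note that `data.shape` above counts the header row, so the number of passengers is one less than its first dimension.

```python
import numpy as np


def check_predictions(rows, expected_label):
    """Split off the header row and verify every prediction equals expected_label.

    Returns (header, number_of_passenger_rows). Hypothetical helper.
    """
    data = np.array(rows)
    header, body = data[0], data[1:]
    # Column 1 holds the predicted 'Survived' value as a string.
    labels = np.unique(body[:, 1])
    if labels.tolist() != [expected_label]:
        raise ValueError('unexpected labels: %s' % labels.tolist())
    return header.tolist(), body.shape[0]


# Hypothetical stand-in for rows read back with csv.reader.
rows = [['PassengerID', 'Survived'], ['893', '1'], ['894', '1']]
header, n_passengers = check_predictions(rows, '1')
print(header, n_passengers)
```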

Kaggle Submission Results

Your submission scored 0.37321

Only 37% correct. Looks like we can do better!

The Overly Pessimistic Model

Let's create a model which predicts that all Titanic passengers die.

First open up the test data


In [74]:
test_data = open('./data/test.csv', 'rb') # Binary mode, as the Python 2 csv module expects
test_data_object = csv.reader(test_data)

Skip the first row because it's the header row


In [75]:
header = test_data_object.next()
header


Out[75]:
['PassengerId',
 'Pclass',
 'Name',
 'Sex',
 'Age',
 'SibSp',
 'Parch',
 'Ticket',
 'Fare',
 'Cabin',
 'Embarked']
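`reader.next()` is a Python 2 method. Here is a small sketch, assuming Python 3, of two ways to consume the header: the built-in `next()` function, and `csv.DictReader`, which reads the header itself and yields each row as a dictionary keyed by column name. The two-row `sample` string is made-up data so that the example runs without the Titanic files.

```python
import csv
import io

# Made-up sample with the same first columns as test.csv.
sample = "PassengerId,Pclass,Sex\n892,3,male\n893,3,female\n"

# Option 1: next() pulls the first row off the reader.
reader = csv.reader(io.StringIO(sample))
header = next(reader)
ids = [row[0] for row in reader]

# Option 2: DictReader consumes the header and keys each row by column name.
dict_reader = csv.DictReader(io.StringIO(sample))
ids_by_name = [row['PassengerId'] for row in dict_reader]

print(header)
print(ids, ids_by_name)
```

Looking values up by column name avoids depending on the passenger id being the first column.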

Let's open a new model (.csv) file for the output predictions


In [76]:
predictions_file = open('./models/jfaPythonModel-allDie.csv', 'wb')
predictions_file_object = csv.writer(predictions_file)

Write the column header row


In [77]:
predictions_file_object.writerow(['PassengerID', 'Survived'])

Write the prediction that every passenger died


In [78]:
for passenger in test_data_object:
    predictions_file_object.writerow([passenger[0], "0"])

Close the test data file and the predictions file


In [79]:
test_data.close()
predictions_file.close()

Take a look at the output predictions


In [80]:
output_predictions_file = open('./models/jfaPythonModel-allDie.csv', 'rb')
output_predictions_file_object = csv.reader(output_predictions_file)

In [81]:
data = []
for passenger in output_predictions_file_object:
    data.append(passenger[0:])
data = np.array(data)

In [82]:
data.shape


Out[82]:
(419, 2)

In [83]:
output_predictions_file.close()

Results

Your submission scored 0.62679.

Quite an improvement: almost 63% of the predictions are correct. But we can do better.
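
The two scores are not independent: 0.37321 + 0.62679 = 1. Every passenger is classified correctly by exactly one of the two constant models, so their accuracies must sum to one, and the all-die score is simply the share of scored passengers who did not survive. Below is a minimal sketch of that relationship, assuming the score is plain accuracy. The `truth` labels are invented for illustration and are not the real test labels.

```python
def accuracy(predictions, truth):
    """Fraction of positions where the prediction matches the true label."""
    matches = sum(1 for p, t in zip(predictions, truth) if p == t)
    return matches / float(len(truth))


# Invented labels (1 = survived, 0 = died); not the real Titanic test labels.
truth = [0, 1, 0, 0, 1, 0, 0, 1]

all_survive = accuracy([1] * len(truth), truth)
all_die = accuracy([0] * len(truth), truth)

print(all_survive, all_die, all_survive + all_die)
```

Any model worth keeping should beat the larger of these two baselines.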

